Reading and Processing .RVG binary files

Hello and welcome to my attempt to read some “simple” binary files.

The goal here:

I need to perform some pre-processing on a set of binary .rvg files (containing trajectory output from a reduced-dynamics run of GEODYN) and stitch them together to construct a GEODYN-specific, trajectory-based tracking data type (called PCE).

Each .rvg file contains a 30-hour arc of ICESat2 trajectory data. These datasets are the output from a very precise run of GEODYN in which the orbit has been determined very well (to a few centimeters accuracy) using a reduced dynamics technique (empirical accelerations and other parameters are adjusted to account for any mismodelled forces).

Methodology for pre-processing the binary files:

  1. dump the data from each arc into some usable format

  2. chop off the 3-hour padding on each end to eliminate discontinuities from end effects

  3. stitch together all the files (I provide some for this notebook as examples)

  4. smooth over any remaining discontinuities across arc gaps or maneuvers.

  5. Put all data into a single TRAJ.txt file to be converted to a GEODYN-specific tracking data type.

Info on the binary files:

(from the GEODYN folks at Goddard)

  1. These files are in what is called the RVG format. The RVG files are pretty simple to unpack (lol)

  2. Each record has 29 words

  3. Each word is a 64 bit floating point number

  4. The first record is a header record with information about the file.

```
#|   Header Record Format:
#|   ---------------------
#|
#|   WORD   | Type | Description
#|   ----     ----   -----------
#|   1         DP     Coord. Sys. Flag
#|                        0 = TOD
#|                        1 = TOR
#|                        2 = J2000
#|   2         DP     Traj start date MJDSEC GPS
#|   3         DP     Traj start frac sec
#|   4         DP     Traj start date (YYMMDDHHMMSS) UTC
#|   5         DP     Traj stop date MJDSEC GPS
#|   6         DP     Traj stop frac sec
#|   7         DP     Traj stop date (YYMMDDHHMMSS) UTC
#|   8         DP     Traj interval sec
#|   9         DP     GEODYN 2s version no.
#|   10        DP     GEODYN 2s run date
#|   11        DP     GEODYN 2s run time
#|   12        DP     GEODYN 2e version no.
#|   13        DP     GEODYN 2e run date
#|   14        DP     GEODYN 2e run time
#|   15        DP     Speed of light
#|   16        DP     GM for Earth
#|   17        DP     Semi-major axis of Earth ref. ellipsoid
#|   18        DP     Equatorial Flattening of Earth ref. ellipsoid
#|   19        DP     Gravitational Potential Checksum
#|   20        DP     Maximum Degree of Gravitational Expansion
#|   21        DP     Maximum Order Of Gravitational Expansion
#|   22-29     DP       spares
```
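Given that framing, a quick sanity check on any such file is its size: 29 words × 8 bytes = 232 bytes per record, plus (assuming the common 4-byte Fortran record markers on each side) 8 bytes of framing, so the file size should be a multiple of 240 bytes. A sketch of that check:

```python
import os

def count_rvg_records(filename):
    """Estimate the number of records in an RVG file, assuming 4-byte
    Fortran record markers around each 29-word (232-byte) record."""
    nbytes = os.path.getsize(filename)
    assert nbytes % 240 == 0, 'unexpected record framing'
    return nbytes // 240
```

This counts header and sentinel records too, so the number of data records is two less.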
  5. The last record is a sentinel record to tell you that you have reached the end of the file.

```
#|   Sentinel Record Format:
#|   ---------------------
#|
#|   WORD | Type | Description
#|   ----   ----   -----------
#|   1       DP     999.0
#|   2       DP     Satellite ID
#|   3       DP     GEODYN IIS Versions
#|   4       DP     GEODYN IIE Versions
#|   5-29    DP     0.0
```

  • The first word of that record has the value 999.0. In other words, when you encounter a record whose first word has the value 999.0, you know you have reached the end of the file.

  6. All the records in the file, except the first and last, are data records.

```
#|   Data Record Format:
#|   ---------------------
#|
#|   WORD   | Type | Description
#|   ----     ----   -----------
#|   1         DP     MJDSEC (secs)  % time is in GPS
#|   2         DP     RSEC (fractional secs)
#|   3         DP     UTC - GPS offset (secs)
#|   4         DP     spare_4
#|   5         DP     X Inertial sat. S.Vec (m)
#|   6         DP     Y Inertial sat. S.Vec (m)
#|   7         DP     Z Inertial sat. S.Vec (m)
#|   8         DP     X_dot Inertial sat. S.Vec (m/sec)
#|   9         DP     Y_dot Inertial sat. S.Vec (m/sec)
#|   10        DP     Z_dot Inertial sat. S.Vec (m/sec)
#|   11        DP     Satellite latitude (degrees)
#|   12        DP     Satellite longitude (degrees)
#|   13        DP     Satellite height (m)
#|   14        DP     X-component ECF position (m)
#|   15        DP     Y-component ECF position (m)
#|   16        DP     Z-component ECF position (m)
#|   17        DP     X_dot-component ECF velocity (m/sec)
#|   18        DP     Y_dot-component ECF velocity (m/sec)
#|   19        DP     Z_dot-component ECF velocity (m/sec)
#|   20        DP     X component of polar motion (milliarcsec)
#|   21        DP     Y component of polar motion (milliarcsec)
#|   22        DP     beta angle (degrees)
#|   23        DP     yaw angle (degrees)
#|   24        DP     orbit angle (degrees)
#|   25        DP     Quaternion QI for J2000 to ITRF (ECF)
#|   26        DP     Quaternion Q2 for J2000 to ITRF (ECF)
#|   27        DP     Quaternion Q3 for J2000 to ITRF (ECF)
#|   28        DP     Quaternion Q4 for J2000 to ITRF (ECF)
#|   29        DP     Greenwich HR angle
```
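Before the full reader below, here is a minimal sketch of walking an RVG file record by record with scipy.io.FortranFile. It assumes each 29-word record was written as one unformatted sequential Fortran record; the sentinel test uses abs() because the format spec says 999.0, while the files read later in this notebook actually carry -999.0.

```python
import numpy as np
from scipy.io import FortranFile

def iter_rvg_records(filename):
    """Yield each 29-word record of an RVG file as a float64 array,
    stopping after the sentinel record."""
    f = FortranFile(filename, 'r')
    try:
        while True:
            rec = f.read_record(np.float64)
            yield rec
            if abs(rec[0]) == 999.0:  # sentinel: first word is +/-999.0
                return
    finally:
        f.close()
```

The first record yielded is the header, the last is the sentinel, and everything in between is data.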

1) Dump the data into a usable format with scipy.io.FortranFile

[1]:
import os

import numpy as np
import pandas as pd
from scipy.io import FortranFile


def read_rvg_binary(record_length, filename):
    '''
    This function converts the RVG trajectory data to a python friendly format.
    Output is a dict that contains the header, data, and sentinel records for a file.

    #------------INFO------------------------
    #
    1. These files are in what is called the **RVG format**. The RVG files are pretty simple to unpack (lol)
    2. Each **record has 29 words**
    3. Each **word is a 64 bit floating point number**
    4. The first record is a *header record* with information about the file.

        ```
        #|   Header Record Format:
        #|   ---------------------
        #|
        #|   WORD   | Type | Description
        #|   ----     ----   -----------
        #|   1         DP     Coord. Sys. Flag
        #|                        0 = TOD
        #|                        1 = TOR
        #|                        2 = J2000
        #|   2         DP     Traj start date MJDSEC GPS
        #|   3         DP     Traj start frac sec
        #|   4         DP     Traj start date (YYMMDDHHMMSS) UTC
        #|   5         DP     Traj stop date MJDSEC GPS
        #|   6         DP     Traj stop frac sec
        #|   7         DP     Traj stop date (YYMMDDHHMMSS) UTC
        #|   8         DP     Traj interval sec
        #|   9         DP     GEODYN 2s version no.
        #|   10        DP     GEODYN 2s run date
        #|   11        DP     GEODYN 2s run time
        #|   12        DP     GEODYN 2e version no.
        #|   13        DP     GEODYN 2e run date
        #|   14        DP     GEODYN 2e run time
        #|   15        DP     Speed of light
        #|   16        DP     GM for Earth
        #|   17        DP     Semi-major axis of Earth ref. ellipsoid
        #|   18        DP     Equatorial Flattening of Earth ref. ellipsoid
        #|   19        DP     Gravitational Potential Checksum
        #|   20        DP     Maximum Degree of Gravitational Expansion
        #|   21        DP     Maximum Order Of Gravitational Expansion
        #|   22-29     DP       spares
        ```
    5.  The last record is a *sentinel record* to tell you that you have reached the end of the file.
        ```
        #|   Sentinel Record Format:
        #|   ---------------------
        #|
        #|   WORD | Type | Description
        #|   ----   ----   -----------
        #|   1       DP     999.0
        #|   2       DP     Satellite ID
        #|   3       DP     GEODYN IIS Versions
        #|   4       DP     GEODYN IIE Versions
        #|   5-29    DP     0.0
        ```
      - The first word of that record has the value 999.0: when you encounter
        a record whose first word is 999.0, you have reached the end of the file.

    6. All the records in the file except the first and last records, are data records.
    ```
    #|   Data Record Format:
    #|   ---------------------
    #|
    #|   WORD   | Type | Description
    #|   ----     ----   -----------
    #|   1         DP     MJDSEC (secs)  % time is in GPS
    #|   2         DP     RSEC (fractional secs)
    #|   3         DP     UTC - GPS offset (secs)
    #|   4         DP     spare_4
    #|   5         DP     X Inertial sat. S.Vec (m)
    #|   6         DP     Y Inertial sat. S.Vec (m)
    #|   7         DP     Z Inertial sat. S.Vec (m)
    #|   8         DP     X_dot Inertial sat. S.Vec (m/sec)
    #|   9         DP     Y_dot Inertial sat. S.Vec (m/sec)
    #|   10        DP     Z_dot Inertial sat. S.Vec (m/sec)
    #|   11        DP     Satellite latitude (degrees)
    #|   12        DP     Satellite longitude (degrees)
    #|   13        DP     Satellite height (m)
    #|   14        DP     X-component ECF position (m)
    #|   15        DP     Y-component ECF position (m)
    #|   16        DP     Z-component ECF position (m)
    #|   17        DP     X_dot-component ECF velocity (m/sec)
    #|   18        DP     Y_dot-component ECF velocity (m/sec)
    #|   19        DP     Z_dot-component ECF velocity (m/sec)
    #|   20        DP     X component of polar motion (milliarcsec)
    #|   21        DP     Y component of polar motion (milliarcsec)
    #|   22        DP     beta angle (degrees)
    #|   23        DP     yaw angle (degrees)
    #|   24        DP     orbit angle (degrees)
    #|   25        DP     Quaternion QI for J2000 to ITRF (ECF)
    #|   26        DP     Quaternion Q2 for J2000 to ITRF (ECF)
    #|   27        DP     Quaternion Q3 for J2000 to ITRF (ECF)
    #|   28        DP     Quaternion Q4 for J2000 to ITRF (ECF)
    #|   29        DP     Greenwich HR angle
    ```

    '''

    header_titles = [ 'coordinate_system',
                      'Traj_start_date_MJDSEC_GPS' ,
                      'Traj_start_frac_sec' ,
                      'Traj_start_date_YYMMDDHHMMSS_UTC' ,
                      'Traj_stop_date_MJDSEC_GPS' ,
                      'Traj_stop_frac_sec' ,
                      'Traj_stop_date_YYMMDDHHMMSS_UTC' ,
                      'Traj_interval_sec' ,
                      'GEODYN_2s_version_no' ,
                      'GEODYN_2s_run_date' ,
                      'GEODYN_2s_run_time' ,
                      'GEODYN_2e_version_no' ,
                      'GEODYN_2e_run_date',
                      'GEODYN_2e_run_time',
                      'Speed_of_light' ,
                      'GM_for_Earth' ,
                      'Semimajor_axis_of_Earth_ref_ellipsoid' ,
                      'Equatorial_Flattening_of_Earth_ref_ellipsoid' ,
                      'Gravitational_Potential_Checksum' ,
                      'Maximum_Degree_of_Gravitational_Expansion' ,
                      'Maximum_Order_Of_Gravitational_Expansion' ,
                      'spare_22' ,
                      'spare_23',
                      'spare_24',
                      'spare_25',
                      'spare_26',
                      'spare_27',
                      'spare_28',
                      'spare_29',
                      ]

    data_titles = [ 'MJDSEC_secs_timeGPS' ,
                    'RSEC_fractional_secs',
                    'GPS_offset_secs_utc' ,
                    'spare_4' ,
                    'X_statevector_m' ,
                    'Y_statevector_m' ,
                    'Z_statevector_m' ,
                    'XDOT_statevector_m_s' ,
                    'YDOT_statevector_m_s' ,
                    'ZDOT_statevector_m_s' ,
                    'latitude_sat',
                    'longitude_sat',
                    'height_sat_m',
                    'X_ECF_m' ,
                    'Y_ECF_m' ,
                    'Z_ECF_m' ,
                    'XDOT_ECF_m_s' ,
                    'YDOT_ECF_m_s' ,
                    'ZDOT_ECF_m_s' ,
                    'X_polarmotion_milliarcsec',
                    'Y_polarmotion_milliarcsec',
                    'beta_angle',
                    'yaw_angle',
                    'orbit_angle',
                    'Quaternion_QI_J2000_to_ITRF_ECF',
                    'Quaternion_Q2_J2000_to_ITRF_ECF',
                    'Quaternion_Q3_J2000_to_ITRF_ECF',
                    'Quaternion_Q4_J2000_to_ITRF_ECF',
                    'Greenwich_HR_angle',
                    ]

    sentinel_titles = ['delimeter',
                      'Satellite_ID',
                      'G_IIS_vers',
                      'G_IIE_vers',
                      ]


    __rvg_filename = filename
    record_len = record_length

    #### determine the approximate number of records...
    # Open file
    with open(__rvg_filename,'rb') as f:
        b=f.read()      # read in binary file as bytes
    np_data = np.frombuffer(b)  # create a numpy array
    est_num_records = int((np_data.size/29) - 29*2)

    print('There are ~ %i records in this file.' % (est_num_records) )

    #### Save the data as a dictionary with keys to hold
    #          1   header   record,
    #         many  data    records
    #          1   sentinel record
    rvg_data = {}
    rvg_data['header']    = {}
    rvg_data['sentinel']  = {}
    rvg_data['data'] = pd.DataFrame(dict(zip(data_titles,np.ones(record_len)*np.nan) ), index=np.arange(0,est_num_records) )

    f = FortranFile(__rvg_filename, 'r')

    end_data_val = -999.0  # sentinel flag (the format spec says 999.0, but these files carry -999.0)
    end_datarecord = False
    counter = 0

    ####   Loop through the binary file and save out each full record.
    #      when we encounter the -999.0 delimiter at the start of the sentinel
    #      record, we have reached the end of the data records.
    #
    #      The data is saved into a DataFrame for "simplicity"

    while end_datarecord == False:
        a = f.read_record(float)  # read the record with the required datatype
        if end_data_val in a:
            ####  If the record contains -999.0 we are at the sentinel record
            #     which denotes the end of the data section.
            print('End of record')
            rvg_data['sentinel'] = dict(zip(sentinel_titles, a))
            end_datarecord = True
            counter += 1
            f.close()  # be sure to close the file
            break
        else:
            if counter == 0:
                #### If the counter is 0 we are on the header record.
                #    this is simply because it is the first record. bottabing bottaboom
                rvg_data['header'] = dict(zip(header_titles, a))
            else:
                #### Everything in the file that isn't header or sentinel is data
                rvg_data['data'].loc[counter-1] = dict(zip(data_titles,a) )
            counter += 1
    # remove the extra NANs that were used to initialize the dataframe
    rvg_data['data'] = rvg_data['data'].dropna(axis=0 ,how='all')
    return(rvg_data)

[2]:
path_to_binaryrvg = '/data/data_geodyn/inputs/icesat2/pre_processing/traj_files_rvg/'

os.chdir(path_to_binaryrvg)
os.system('gunzip -vr orbit.1807001.2018.287.gz')
os.system('gunzip -vr orbit.1807001.2018.288.gz')


rvg_data1 = read_rvg_binary(29, path_to_binaryrvg + 'orbit.1807001.2018.287')
rvg_data2 = read_rvg_binary(29, path_to_binaryrvg + 'orbit.1807001.2018.288')

There are ~ 109434 records in this file.
End of record
There are ~ 109434 records in this file.
End of record
[3]:
os.system('gzip -v orbit.1807001.2018.287')
os.system('gzip -v orbit.1807001.2018.288')

[3]:
0
[4]:
rvg_data1['header']
[4]:
{'coordinate_system': 2.0,
 'Traj_start_date_MJDSEC_GPS': 2454182298.0,
 'Traj_start_frac_sec': 0.0,
 'Traj_start_date_YYMMDDHHMMSS_UTC': 1181013211800.0,
 'Traj_stop_date_MJDSEC_GPS': 2454288138.0,
 'Traj_stop_frac_sec': 0.0,
 'Traj_stop_date_YYMMDDHHMMSS_UTC': 1181015024200.0,
 'Traj_interval_sec': 1.0,
 'GEODYN_2s_version_no': 1802.0,
 'GEODYN_2s_run_date': 20201125.0,
 'GEODYN_2s_run_time': 125600.0,
 'GEODYN_2e_version_no': 1802.0,
 'GEODYN_2e_run_date': 100519.0,
 'GEODYN_2e_run_time': 20201125133317.0,
 'Speed_of_light': 299792458.0,
 'GM_for_Earth': 398600441500000.0,
 'Semimajor_axis_of_Earth_ref_ellipsoid': 6378136.3,
 'Equatorial_Flattening_of_Earth_ref_ellipsoid': 0.0033528199227242064,
 'Gravitational_Potential_Checksum': -0.00047889192726204564,
 'Maximum_Degree_of_Gravitational_Expansion': 0.0,
 'Maximum_Order_Of_Gravitational_Expansion': 0.0,
 'spare_22': 0.0,
 'spare_23': 0.0,
 'spare_24': 0.0,
 'spare_25': 0.0,
 'spare_26': 0.0,
 'spare_27': 0.0,
 'spare_28': 0.0,
 'spare_29': 0.0}

1.a Look at the data for a few files to see the discontinuities.

Convert MJDS to YYMMDDHHMMSS

[5]:


def MJDS_to_YYMMDDHHMMSS(input_ModJulianDay_secs):
    '''
    This function takes modified julian day seconds (MJDS) as input
    and returns a date string in the format YYMMDD-HHMMSS.
    '''

    ### Modified Julian Date conversion
    # *  MODIFIED JULIAN DAY = JULIAN DAY - GEODYN REFERENCE TIME IN JD

    #########################################
    # Define some constants
    SECDAY              = 86400
    geodyn_ref_time_mjd = 30000
    jd_0                = 2400000.5
    d36525              = 365.25
    d122                = 122.1
    d30600              = 30.6001
    half                = 0.5
    ib = -15
    d17209 = 1720996.5

    ######  CONVERT FROM MJDS TO MJD
    # Inputs:
    MJDS = input_ModJulianDay_secs
    #
    MJD = (MJDS/SECDAY) + geodyn_ref_time_mjd

    ######  CONVERT FROM MJD TO YMD
    # Note from zach-- I took this calculation from geodyn...
    # There is more going on here than I understand,
    # but I want to stay on their level of accuracy
    #
    JD = MJD + jd_0                  # Convert to JulianDay
    c  = int( JD + half ) + 1537     # ??   sorry, i'm   ??
    nd = int( (c - d122) / d36525 )  # ??   not sure     ??
    e  = int( d36525 * nd )          # ??   what this    ??
    nf = int( ( c - e ) / d30600 )   # ??   all is       ??
    # ----
    frac = (JD + half) - int( JD + half )           # frac of day leftover
    iday = c - e - int( d30600 * nf ) + frac        # day
    imonth  = nf -  1   - 12 * int( nf / 14 )       # month
    iyyyy = nd - 4715 - int(  ( 7 + imonth ) / 10 ) # YYYY
    #
    ##### Use modular division to get 2 digit year
    iyear =  iyyyy % 100
    #
    #### Return YYMMDD
    yymmdd = int(iyear * 10000 + imonth * 100 + iday)



#     ##### Calculate Hours, Minutes, seconds


    ##### Calculate Hours, Minutes, seconds
    isec_mjd  =  MJDS % 86400

    ihour     = isec_mjd/3600
    iminutes = (ihour % 1)*60
    isec = (iminutes % 1)*60

    isec_str      = str(int(isec))
    ihour_str = str(int(ihour))
    iminutes_str  = str(int(iminutes))

    if len(ihour_str)==1:
        ihour_str = '0'+ihour_str
    if len(iminutes_str)==1:
        iminutes_str = '0'+iminutes_str
    if len(isec_str)==1:
        isec_str = '0'+isec_str

    #hhmmss  =  int((ihour*10000) + (iminutes*100) + isec)
    hhmmss  =  ihour_str + iminutes_str + isec_str

#     print('mjd',MJDS)
#     print('isec_mjd',isec_mjd)
#     print('--')
#     print('ihour', str(ihour))
#     print('ihourstr', ihour_str)
#     print()
#     print('iminutes', iminutes)
#     print('iminutes_str', iminutes_str)
#     print()
#     print('isec', isec)
#     print('isec_str', isec_str)
#     print('HHMMSS', str(hhmmss))
#     print()
#     print('------------------------')

    YYMMDDHHMMSS = str(yymmdd) + '-' + str(hhmmss)
#     print(MJDS)
#     print('YYMMDD', str(yymmdd))
#     print('HHMMSS', str(hhmmss))
#     print()
    return(YYMMDDHHMMSS)
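As a sanity check on the hand-rolled conversion above, the same mapping can be reproduced directly with pandas. The only assumption is GEODYN's reference epoch of MJD 30000, which falls on 1941-01-06; everything else is plain timedelta arithmetic.

```python
import pandas as pd

GEODYN_REF = pd.Timestamp('1941-01-06')  # MJD 30000, GEODYN's reference epoch

def mjds_to_datetime(mjds):
    """Convert GEODYN modified-Julian-day seconds to a pandas Timestamp."""
    return GEODYN_REF + pd.to_timedelta(mjds, unit='s')

# the header's Traj_start_date_MJDSEC_GPS from file 287:
print(mjds_to_datetime(2454182298.0))  # 2018-10-13 21:18:18 (GPS time)
```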


[6]:


#### TEST to see if the time conversion works
# YYMMDDHHMMSS = MJDS_to_YYMMDDHHMMSS(start_date_MJDSEC_GPS)
# print(pd.to_datetime( YYMMDDHHMMSS, format='%y%m%d-%H%M%S'))
# print(pd.to_datetime( str(int(start_date_YYMMDDHHMMSS_UTC)), format='1%y%m%d%H%M%S'))

def RVG_Files_add_datetime_column(rvg_file):

    # data
    mjdsecs = rvg_file['data']['MJDSEC_secs_timeGPS']

    #convert the MJDS to a usable string.
    yymmdd_str = [MJDS_to_YYMMDDHHMMSS(x) for x in mjdsecs]

    # convert string of dates to datetime for plotting
    dates      = [pd.to_datetime( x, format='%y%m%d-%H%M%S') for x in yymmdd_str]

    return(dates)


dates1 = RVG_Files_add_datetime_column(rvg_data1)
dates2 = RVG_Files_add_datetime_column(rvg_data2)

# add the datetime column to the DataFrame within the dictionary
rvg_data1['data'].insert(0, 'Date', dates1)
rvg_data2['data'].insert(0, 'Date', dates2)


We can see below that each file is 29.40027 hours long

[7]:
starttime = rvg_data1['data']['Date'].iloc[0]
endtime = rvg_data1['data']['Date'].iloc[-1]
print('Start: ',starttime)
print('end  : ',endtime)

time_diff = pd.Timedelta(endtime - starttime)
time_diff_totalhours = time_diff.total_seconds() / 3600.0

print('Each file contains' ,time_diff_totalhours,'hours worth of data')
Start:  2018-10-13 21:18:17
end  :  2018-10-15 02:42:18
Each file contains 29.400277777777777 hours worth of data

Find the overlap between two files:

[8]:
from collections import namedtuple
Range = namedtuple('Range', ['start', 'end'])

def time_overlap(file1, file2, verbose=False):
    data1 = file1['data']['Date']
    data2 = file2['data']['Date']
    r1 = Range(start=data1.iloc[0], end=data1.iloc[-1])
    r2 = Range(start=data2.iloc[0], end=data2.iloc[-1])
    latest_start = max(r1.start, r2.start)
    earliest_end = min(r1.end, r2.end)
    delta = (earliest_end - latest_start).total_seconds()/3600
    overlap = (delta)

    if verbose:
        print('Latest_start',latest_start)
        print('Earliest_end',earliest_end)
        print('Overlap of ',overlap, 'hours')

    return latest_start, earliest_end, overlap

(start_overlap, end_overlap, tot_overlap) = time_overlap(rvg_data1, rvg_data2, verbose=True)
Latest_start 2018-10-14 21:18:17
Earliest_end 2018-10-15 02:42:18
Overlap of  5.400277777777778 hours
[9]:
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import plotly.express as px

fig = make_subplots(rows=1, cols=1,
                   )
string = 'X_statevector_m'
fig.add_trace(go.Scattergl(x=rvg_data1['data']['Date'].iloc[-20000:],
                           y=rvg_data1['data'][string].iloc[-20000:]*1e2,
                           name= "set 1",
                           mode='markers',
                           marker=dict(
                           size=6,),
                           ),
                            row=1, col=1,
                           )
fig.add_trace(go.Scattergl(x=rvg_data2['data']['Date'].iloc[:20000],
                           y=rvg_data2['data'][string].iloc[:20000]*1e2,
                           name= "set 2",
                           mode='markers',
                           marker=dict(
                           size=3,),
                           ),
                            row=1, col=1,
                           )


fig.update_layout(
    title="rvg files 1 and 2 overlap ",
    yaxis_title="state vector X [cm]",
    xaxis_title="Date",
    )

fig.update_layout(legend= {'itemsizing': 'constant'})
fig.update_xaxes(tickangle=30)
fig.update_xaxes( exponentformat= 'power')
fig.update_yaxes( exponentformat= 'power')

fig.show()


We want to cut off the same amount of time from each file such that they line up perfectly

[10]:

print('Cut same amount off each end of file.  So, ',tot_overlap/2, 'hours from each end of file')
Cut same amount off each end of file.  So,  2.700138888888889 hours from each end of file
[11]:

rvg_data1['data']['Date']
[11]:
0        2018-10-13 21:18:17
1        2018-10-13 21:18:19
2        2018-10-13 21:18:20
3        2018-10-13 21:18:20
4        2018-10-13 21:18:21
                 ...
105836   2018-10-15 02:42:13
105837   2018-10-15 02:42:15
105838   2018-10-15 02:42:15
105839   2018-10-15 02:42:16
105840   2018-10-15 02:42:18
Name: Date, Length: 105841, dtype: datetime64[ns]
[12]:

print('Last time', rvg_data1['data']['Date'].iloc[-1] )

rvg_data1['data']['Date'].iloc[-1] - pd.Timedelta(tot_overlap/2, unit='hours')
Last time 2018-10-15 02:42:18
[12]:
Timestamp('2018-10-15 00:00:17.499999600')
[13]:
def RVGfiles_timeoverlap_GetChoppingTime(file1, file2):

    (_, _, tot_overlap) = time_overlap(file1, file2)

    file1_start = file1['data']['Date'].iloc[0]
    file1_end = file1['data']['Date'].iloc[-1]

    file2_start =  file2['data']['Date'].iloc[0]
    file2_end =  file2['data']['Date'].iloc[-1]

    file1_new_start = file1_start + pd.Timedelta(tot_overlap/2, unit='hours')
    file1_new_end = file1_end - pd.Timedelta(tot_overlap/2, unit='hours')

    file2_new_start = file2_start + pd.Timedelta(tot_overlap/2, unit='hours')
    file2_new_end = file2_end - pd.Timedelta(tot_overlap/2, unit='hours')

#     print('file1_last:  ',file1_last)
#     print('file2_start: ',file2_start)

#     print('file1_new_last:  ',file1_new_last)
#     print('file2_new_start: ',file2_new_start)

    return(file1_new_start, file1_new_end, file2_new_start, file2_new_end)



[14]:
file1_new_start, file1_new_end, file2_new_start, file2_new_end = RVGfiles_timeoverlap_GetChoppingTime(rvg_data1, rvg_data2)
print('file1_new_start', file1_new_start)
print('file1_new_end', file1_new_end)
print('file2_new_start', file2_new_start)
print('file2_new_end', file2_new_end)
file1_new_start 2018-10-14 00:00:17.500000400
file1_new_end 2018-10-15 00:00:17.499999600
file2_new_start 2018-10-15 00:00:17.500000400
file2_new_end 2018-10-16 00:00:17.499999600
[15]:

def RVGfiles_chop_the_ends(file1, file2):
    '''
    Chop the ends off the file.
    '''

    file1_new_start, file1_new_end, file2_new_start, file2_new_end = RVGfiles_timeoverlap_GetChoppingTime(file1, file2)


    df1 = file1['data']
    df2 = file2['data']

    ##### Chop the FRONT off of the FIRST file
    # select all the values earlier than our new start and grab the last one
    val1_front = df1.Date[df1.Date < file1_new_start].iloc[-1]
    indx1_front = df1.Date[df1.Date==val1_front].index.unique()[0]

    ##### Chop the END off of the FIRST file
    # select the values later than our new end and take an index just past the boundary
    val1_end = df1.Date[df1.Date > file1_new_end].iloc[1]
    indx1_end = df1.Date[df1.Date==val1_end].index.unique()[0]



    ##### Chop the FRONT off of the SECOND file
    # select all the values earlier than our new start and grab the last one
    val2_front = df2.Date[df2.Date < file2_new_start].iloc[-1]
    indx2_front = df2.Date[df2.Date==val2_front].index.unique()[0]

    ##### Chop the END off of the SECOND file
    # select the values later than our new end and take an index just past the boundary
    val2_end = df2.Date[df2.Date > file2_new_end].iloc[1]
    indx2_end = df2.Date[df2.Date==val2_end].index.unique()[0]

#     print('indx1_front',indx1_front)
#     print('indx1_end',indx1_end)
#     print('indx2_front',indx2_front)
#     print('indx2_end',indx2_end)

    df1_new = df1[:indx1_end][indx1_front+1:] # add one index so there is no overlap in time
    df2_new = df2[:indx2_end][indx2_front+1:] # add one index so there is no overlap in time

    return(df1_new, df2_new)


[16]:
(df1_new, df2_new) = RVGfiles_chop_the_ends(rvg_data1, rvg_data2)
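With the overlaps chopped, step 3 of the methodology (stitching the arcs together) reduces to a concatenation. A sketch, assuming each chopped frame still carries its Date column:

```python
import pandas as pd

def stitch_arcs(frames):
    """Concatenate chopped per-arc DataFrames into one continuous
    trajectory, sorted by Date, dropping any duplicate timestamps."""
    combined = pd.concat(frames, ignore_index=True)
    combined = combined.sort_values('Date').drop_duplicates(subset='Date')
    return combined.reset_index(drop=True)
```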

Plot again to see how we did

[17]:
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import plotly.express as px

fig = make_subplots(rows=1, cols=1,
                   )
string = 'X_statevector_m'
fig.add_trace(go.Scattergl(x=df1_new['Date'],
                           y=df1_new[string]*1e2,
                           name= "adjusted set 1",
                           mode='markers',
                           marker=dict(
                           size=6,),
                           ),
                            row=1, col=1,
                           )
fig.add_trace(go.Scattergl(x=df2_new['Date'],
                           y=df2_new[string]*1e2,
                           name= "adjusted set 2",
                           mode='markers',
                           marker=dict(
                           size=3,),
                           ),
                            row=1, col=1,
                           )


fig.update_layout(
    title="Adjusted Overlap on files 1 and 2 ",
    yaxis_title="state vector X [cm]",
    xaxis_title="Date",
    )

fig.update_layout(legend= {'itemsizing': 'constant'})
fig.update_xaxes(tickangle=30)
fig.update_xaxes( exponentformat= 'power')
fig.update_yaxes( exponentformat= 'power')

fig.show()


Construct G2B files

Determine files that will go into each arc (files between maneuvers)

```
SATPAR   139     1807001          9.53000000       1514.000
EPOCH               181013210000.0000000181013210000.00000001810160300 00.000
```

The next step in constructing the G2B PCE GEODYN input is to build an ASCII text file and run it through ``pce_converter.f``.

Steps to construct the g2b binary file:

  1. Load several (sample) days of RVG files

  2. Process and combine them into an ascii file titled TRAJ.txt

  3. Feed TRAJ.txt into PCE_converter.f. The fortran code expects the following:

     1. A file titled TRAJ.txt

        • Each record (line) of the TRAJ.txt file has one integer and four floating point words.

        • The integer and the first floating point word form the time tag of the record.

        • The integer of TRAJ.txt is just the first word of the RVG data record converted to an integer.

        • The first floating point of TRAJ.txt is just the sum of words 2 and 3 of the RVG data record.

        • The last three words are the X, Y and Z coordinates of the satellite in the J2000 coordinate system.

        • The X, Y and Z values are just words 5, 6 and 7 of the RVG data record.
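The record description above can be sketched as a line formatter (the exact column widths here are an illustration, not necessarily the widths pce_converter.f expects):

```python
def traj_line(record):
    """Format one 29-word RVG data record as a TRAJ.txt line:
    integer seconds, fractional-second tag, then J2000 X, Y, Z."""
    isec = int(record[0])              # word 1: MJDSEC, integer part of time tag
    fsec = record[1] + record[2]       # words 2 + 3: frac sec + (UTC - GPS) offset
    x, y, z = record[4], record[5], record[6]  # words 5-7: J2000 position (m)
    return f'{isec:12d} {fsec:15.9f} {x:17.5f} {y:17.5f} {z:17.5f}'
```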

All auxiliary files will continue to be hosted at /data/data_geodyn/inputs/icesat2/pre_processing/

[18]:
###### Notes:
# SATID = 1807001
# epoch_start = '181013210000'
##              YYMMDDHHMMSS.mls0000cd
# epoch_end   = '181016030000'
##              YYMMDDHHMM SS.mls

# 54 hours (2 days 6 hours)
# Oct 13- 2100  DOY=287
# Oct 14-       DOY=288
# Oct 15-       DOY=289
# Oct 16- 0300  DOY=290



[19]:
arc1_files = ['orbit.1807001.2018.287' ,
              'orbit.1807001.2018.288' ,
              'orbit.1807001.2018.289A',
              'orbit.1807001.2018.290A',
              'orbit.1807001.2018.291',
#               'orbit.1807001.2018.292',
#               'orbit.1807001.2018.293',
#               'orbit.1807001.2018.294',
             ]

We will do this in a class using ``pygeodyn`` so that much of the work of calling the code happens behind the scenes

Construct ``TRAJ.txt`` as an ascii file

[20]:
# %load_ext autoreload
# %autoreload 2

# import sys
# sys.path.insert(0,'/data/geodyn_proj/pygeodyn/utils_pygeodyn_develop/')
# from pre_processing import pygeodyn_PreProcessing


# path_to_preprocessing = '/data/data_geodyn/inputs/icesat2/pre_processing'
# path_to_binaryrvg     = '/data/data_geodyn/inputs/icesat2/pre_processing/traj_files_rvg'
# Obj = pygeodyn_PreProcessing(path_to_binaryrvg, path_to_preprocessing,  arc1_files)

# # Obj.get_timechopped_rvgdata()

# Obj.make_ascii_traj_txt_file_for_pcefortran()

Call the fortran code with F2PY

[ ]:

[21]:
# import subprocess
# import os

# path_to_data = '/data/data_geodyn/inputs/icesat2/pre_processing/'

# path_to_pygeodyn = '/data/geodyn_proj/pygeodyn/utils_pygeodyn_develop/util_preprocessing/'

# ### CHANGE DIRECTORY to where the fortran code is hosted
# os.chdir(path_to_pygeodyn)

# #### Compile the pce code:
# command_1 = './compile_pce_f.sh'
# subprocess.run(command_1, shell = True)


[22]:
# command_2 = './ExecutePCE.exe > out_pce 2> err_execute'
# subprocess.run(command_2, shell = True)

[23]:
# %load_ext autoreload
# %autoreload 2
# import sys
# sys.path.insert(0,'/data/geodyn_proj/pygeodyn/utils_pygeodyn_develop/util_preprocessing/')
# from pre_processing import pygeodyn_PreProcessing

# path_to_preprocessing = '/data/data_geodyn/inputs/icesat2/pre_processing'
# path_to_binaryrvg     = '/data/data_geodyn/inputs/icesat2/pre_processing/traj_files_rvg'
# Obj = pygeodyn_PreProcessing(path_to_binaryrvg, path_to_preprocessing,  arc1_files)


# Obj.call_fortran_pce_converter()
[24]:
# !f2py -c -m pce_converter pce_converter.f
# !f2py -h --overwrite-signature fortran_pce.pyf  pce_converter.f
[ ]: